Bugfix for join_locations #140

Merged
merged 65 commits into from
Dec 8, 2024

Conversation

dfalster
Member

@dfalster dfalster commented Dec 5, 2024

No description provided.

fontikar and others added 30 commits January 12, 2023 16:04
…king for new version, waiting for austraits.build update
…rough the internal data subsets for version 3.0.2 and version 4.0.0. Pivot_ needs austraits.build fixes to implement. Some minor fix to pivot_wider for dependencies #60
Some packages needed to make plots are included in suggests. This means that core functions of package may not work.
- reuse outputs from previous function calls to reduce runtime
- reduce dataset sizes for slow functions (summarise_trait_means, trait_pivot_wider, plot_locations)
- silence some outputs

closes #62
- As documented in #79 , the Zenodo API has changed, breaking our download feature. 
- This commit updates the internals to work with the latest changes. 

Specifically: 

- the way to access json for all versions has changed (changed url structure, and for id we now use one of the record ids, rather than the conceptid)
- the call to download file has changed
- format of the API json has changed

Also

- added record id to the table of versions
- put a check in to remove "v" from any version entered by user
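
The version check in the last bullet can be sketched as below. This is an illustrative Python sketch, not the package's R code: `normalise_version` is a hypothetical name, and stripping surrounding whitespace plus a single leading "v" (either case) is an assumption about what the check does.

```python
def normalise_version(version: str) -> str:
    """Return a version string without a leading 'v' (hypothetical helper)."""
    # Assumption: users may type "v5.0.0", "V5.0.0" or " 5.0.0 ".
    cleaned = version.strip()
    # Drop at most one leading "v"/"V"; the rest of the string is untouched.
    if cleaned[:1] in ("v", "V"):
        cleaned = cleaned[1:]
    return cleaned


if __name__ == "__main__":
    for raw in ("v5.0.0", "4.0.0", " V3.0.2 "):
        print(repr(raw), "->", normalise_version(raw))
```

Normalising once at the entry point means the rest of the download code only ever has to compare bare version numbers against the table of versions.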
* changes required for v5 austraits.build

* Removed original_name for trait_pivot_wider3 for v5.0.0

* Added trait_pivot_wider for v4.x.x and code for what_version

* Making new switches for join and as_wide_table based on new versioning

* Sub switch for extract_ and recreated internal data

* Sub switches for trait_pivot_longer

* Minor fix in join_methods

* Added vars as global vars

* Removed .data calls when not needed

* Update to work with latest zenodo API (#81)


- As documented in #79 , the Zenodo API has changed, breaking our download feature. 
- This commit updates the internals to work with the latest changes. 

Specifically: 

- the way to access json for all versions has changed (changed url structure, and for id we now use one of the record ids, rather than the conceptid)
- the call to download file has changed
- format of the API json has changed

Also

- added record id to the table of versions
- put a check in to remove "v" from any version entered by user

* Recreated data so extract is passing

* Update `treatment_id` with `treatment_context_id`

* Revert "Update `treatment_id` with `treatment_context_id`"

This reverts commit 3fc6717.

* minor column name changes

Changes column names, reflecting recent changes to traits.build output.

* Update as_wide_table.R

add `any_of` to column selection within `as_wide_table` to accommodate other traits.build databases that don't have the same columns in taxon_list.csv

* Fixed getting versions and load austraits with zenodo updates and minor update with as_wide_table with removal of variable

* Fixed minor bug in get_version_latest

---------

Co-authored-by: Elizabeth Wenk <[email protected]>
Co-authored-by: Daniel Falster <[email protected]>
Co-authored-by: yangsophieee <[email protected]>
* Created lites for all main versions of AusTraits

* Passing for as_wide_table

* Passing for as_wide_table and for extract_

* expanding test suite to all 3 majors, switches for method id adjusted

* expanding test suite to all 3 majors for summarise_D

* Expanding these for all 3 versions

* Added PR trigger for dev branch
ehwenk and others added 26 commits September 27, 2024 09:59
Update previous join_ functions and create new database_create_combined_table function to replace join_all.

This includes:

- Splitting the old join_locations into two functions: join_location_coordinates to explicitly join latitude/longitude and join_location_properties to join location metadata
- join_contexts, join_locations and join_contributors now offer different output formats: many_columns, single_column_pretty, single_column_json
- All functions gain the option vars = "all", which adds all columns/location properties/context properties.
- join_contexts is reworked, using the variant that was developed on traits.build for database_create_combined_table. The old join_contexts is still required for as_wide_table (the combined table currently output via austraits.build API) and has been moved to that file.
- as_wide_table is maintained for now to support the austraits.build API, but will be removed in the coming months.
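
To make the three output formats concrete, here is a minimal Python sketch, not the package's R implementation. The function name `format_location_properties`, the input layout (one row per location property), the `location_properties` column name and the `key=value; ...` pretty format are all assumptions made for illustration. Note that `location_name` stays its own column in every format.

```python
import json
from collections import defaultdict


def format_location_properties(rows, fmt="single_column_pretty"):
    """Collapse long-format location properties to one record per location.

    Hypothetical helper; `rows` are dicts with the keys location_id,
    location_name, location_property and value.
    """
    # Group property/value pairs by location, keeping the name alongside the id.
    grouped = defaultdict(dict)
    for row in rows:
        key = (row["location_id"], row["location_name"])
        grouped[key][row["location_property"]] = row["value"]

    records = []
    for (location_id, location_name), props in grouped.items():
        record = {"location_id": location_id, "location_name": location_name}
        if fmt == "many_columns":
            # One column per property.
            record.update(props)
        elif fmt == "single_column_pretty":
            # Human-readable string (separator is an assumption).
            record["location_properties"] = "; ".join(
                f"{name}={value}" for name, value in props.items()
            )
        elif fmt == "single_column_json":
            # Machine-readable string that can be parsed back later.
            record["location_properties"] = json.dumps(props)
        else:
            raise ValueError(f"unknown format: {fmt}")
        records.append(record)
    return records


if __name__ == "__main__":
    example = [
        {"location_id": "01", "location_name": "Site A",
         "location_property": "description", "value": "heath"},
        {"location_id": "01", "location_name": "Site A",
         "location_property": "elevation (m)", "value": "120"},
    ]
    for fmt in ("many_columns", "single_column_pretty", "single_column_json"):
        print(fmt, format_location_properties(example, fmt))
```

The single-column formats keep the table width fixed regardless of how many distinct properties a database contains, while the JSON variant can be expanded back into columns without string parsing.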

---------

Co-authored-by: Daniel Falster <[email protected]>
   * New generalised `extract_data` function that makes it possible to subset a traits.build database using any column in any table. (closes issue #82)
    - The function works with single values or vectors of values
    - Does not return an error for empty searches
    - For context property values, searches across all context property categories simultaneously
    - This function is now called by `extract_trait`, `extract_taxa` and `extract_dataset` (closes issue #107)
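
The behaviour described in these bullets can be sketched roughly as follows. This is a simplified Python illustration, not the R function: the database is modelled as a dict of tables (lists of row dicts), and linking tables only through `dataset_id` is an assumption that ignores the finer-grained keys a real traits.build database uses.

```python
def extract_data(database, table, column, values):
    """Subset every table of `database` by matching `column` in `table`.

    Simplified sketch: tables are lists of dicts and are linked only by
    `dataset_id` (an assumption made for illustration).
    """
    # Accept a single value or a collection of values.
    if isinstance(values, (str, int, float)):
        values = [values]
    wanted = set(values)

    # Rows of the target table that match; may legitimately be empty.
    matched = [row for row in database[table] if row.get(column) in wanted]
    dataset_ids = {row["dataset_id"] for row in matched}

    # Keep matching rows in the target table and linked rows elsewhere.
    subset = {}
    for name, rows in database.items():
        if name == table:
            subset[name] = matched
        else:
            subset[name] = [r for r in rows if r["dataset_id"] in dataset_ids]
    return subset


if __name__ == "__main__":
    db = {
        "traits": [
            {"dataset_id": "ds1", "trait_name": "leaf_area", "value": "12"},
            {"dataset_id": "ds2", "trait_name": "wood_density", "value": "0.6"},
        ],
        "locations": [
            {"dataset_id": "ds1", "location_name": "Site A"},
            {"dataset_id": "ds2", "location_name": "Site B"},
        ],
    }
    print(extract_data(db, "locations", "location_name", "Site A"))
    print(extract_data(db, "traits", "trait_name", ["no_such_trait"]))
```

An empty search returns a database whose tables are all empty rather than raising an error, which lets callers chain several extractions without special-casing misses.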

* Add `bind_databases`.
    - The function from traits.build that is used to bind together datasets has been moved to austraits, allowing one to extract multiple pieces of a database, then bind them back together into a single database (bind_datasets) (closes issue #106)
    - Remove naming of individual databases from `bind_databases`. This had been drawing the `dataset_id` for the database name, but this name then disappeared once the datasets were bound and the results seem to be identical with and without it (including using it to build austraits.build). And of course for generalised use, there is no longer a unique dataset_id for each database being bound. But it seems like in traits.build the column `dataset_id` for the table `taxonomic_updates` is mutated as part of this function - to be checked once we properly reimport this to traits.build. (closes issue #114)

* Add utility functions from traits.build
    - Add `convert_list_to_df1` (util_list_to_df1 on traits.build), `convert_list_to_df2` (util_list_to_df2 on traits.build), and `convert_df_to_list` (util_df_to_list on traits.build)

* Various documentation fixes
    - Rename `database_create_combined_table` to `flatten_database`
    - Fix bug in `flatten_database` - contributors cannot return many columns
    - Update package contributors documentation (closes issue #110)
    - Export `join_context_properties` and add separate documentation for each `join_` function
    - Updating documentation to pass R CMD check, argument names renamed, adding links in roxygen
    - Added some global variables
    - Line Breaks for extract and updated gha workflow
    - Added upload token
    - Changed `plant_trait_name` to `trait_name` in `plot_trait_distribution_beeswarm`

* Add and edit tests
    - NOTE: the function `summarise_trait_means` was still being tested on v3.0.0 and, looking at the code, it will not actually summarise, because it only searches for replicates within a single observation_id; we've commented out this test and will revisit it when we edit this function in 2025
---------

Co-authored-by: Fonti Kar <[email protected]>
* standardising file names

* standardising parameters throughout austraits (closes issue #111)

- replace all `@param austraits` and `@param aus_traits` with `@param database`
- replace all references to just the traits table (previously x, data, trait) with `@param trait_data`
- standardise description of parameters - currently using `@param database traits.build database (list object)` so people realise it is a relational database, but could change to `@param database traits.build database`

* standardise pipes

- change all native pipes (|>) to %>% pipes (closes issue #113)
*    extract_ functions can now work with a single table or an entire traits.build database; the single table can have any columns, any number of columns
*    required a tweak to check_compatibility to allow single tables to be declared compatible. An extra parameter was added that indicates whether the incoming data is allowed to be a single table
*   lots of documentation that was supposed to be on the previous commit

Closes issue #120
* Changed class name and generic name

* Rename summarise_austraits to summarise_database

* Updated format for print message for v<5

* Updated start up message
* bug fix for `extract_data` so it works even after you have joined other columns
* edit `plot_trait_distribution_beeswarm` to work with a single table and for its message to accurately reflect which grouping variables are allowed
* Renamed file and integrated check compatibility in print function
As described in issue #123 the function is outputting inaccurate results if presented with a database with data from multiple datasets. We are removing it until we build a more sophisticated function.

Closes issue #123
Revert to original format of `bind_databases` arguments - required for traits.build to work using furrr method
* Update README.Rmd

* Returning tibble for flatten_database

* Minor updates to dependencies to pass R CMD check

Remove reference to `summarise_trait_means`

* Extract_data missing S3 class assign

* Updated bootswatch

---------

Co-authored-by: ehwenk <[email protected]>
* Export print function so it works and available for testing

* More tests for plot_locations

* Used cli for error messages and added more tests for plot_trait_

* Snapshotting the output of print.traits.build

* Testing functions in utils file and removing what_version

* Testing for get_versions_latest
Add tests for parts of functions not being tested
* Function to test database structure

* Adding new test function where appropriate

* Rebuilt embedded lite data layers so that they have the proper class attached to them

---------

Co-authored-by: Daniel Falster <[email protected]>
* Add missing cheatsheet

* Ignore from RBuild
- `join_location_properties(format = "single_column_json")` was not retaining `location_name` as its own column. Fixed this and added tests.
- Also, `value_type` was being lost during `separate_trait_values`, and probably has been for a long time.
@dfalster dfalster requested a review from ehwenk December 5, 2024 07:14
Collaborator

@ehwenk ehwenk left a comment

Bugfix looks good.

@ehwenk ehwenk merged commit 9935b90 into master Dec 8, 2024
8 checks passed
@ehwenk ehwenk deleted the develop branch December 8, 2024 22:40
3 participants